
Uniformize kwargs for image-text-to-text processors #32544

Conversation

@yonigozlan (Member) commented Aug 8, 2024

What does this PR do?

Adds uniformized processor kwargs, following #31911, for the following image-text-to-text models:

  • Align
  • Fuyu
  • InstructBlip
  • Kosmos-2
  • LLaVa-NeXT
  • Pix2Struct

I will open a separate PR for Idefics and Idefics2, as their processors are quite different from the others.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@molbap @zucchini-nlp @amyeroberts

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan yonigozlan mentioned this pull request Aug 8, 2024
@yonigozlan yonigozlan marked this pull request as ready for review August 8, 2024 22:32
@molbap (Contributor) left a comment

very nice! Jumping back to my own processors merging after that, let's go

Review threads (resolved): src/transformers/models/kosmos2/processing_kosmos2.py, src/transformers/processing_utils.py, tests/models/fuyu/test_processing_fuyu.py
@zucchini-nlp (Member) left a comment

Great work, thanks! Looks good overall, mainly concerned about not breaking BC for users. Left a few comments

Review threads (resolved): tests/models/fuyu/test_processing_fuyu.py, tests/models/llava_next/test_processor_llava_next.py
Comment on lines 176 to 177
# Temporary fix for "padding_side" in init_kwargs
_ = self.tokenizer.init_kwargs.pop("padding_side", None)
@zucchini-nlp (Member):

Not very clear why we need this hack

@molbap (Contributor):

It's related to AutoTokenizer mapping, @yonigozlan can say a bit more :)

@yonigozlan (Member, Author):

@zucchini-nlp From what I’ve seen, some tokenizers accept padding_side as a call argument, while others don’t. But when you save weights and configs using a tokenizer loaded with AutoTokenizer and then reload them, all possible init kwargs (including padding_side) get added to the tokenizer’s init_kwargs, even if they weren’t explicitly specified in the first place. So when merging tokenizer.init_kwargs with the output_kwargs, a tokenizer that doesn’t support padding_side in its call function will raise an error.

Hopefully that makes sense - it’s still a bit unclear to me too, to be honest. :)
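For illustration, the merge problem described above can be sketched with plain functions (the names and dicts below are invented for this sketch, not the actual transformers internals):

```python
# Sketch: "padding_side" ends up in the reloaded tokenizer's init_kwargs, and
# blindly merging those into the call kwargs breaks a tokenizer whose call
# signature does not accept it.

def tokenize(text, padding=False, truncation=False):
    """A tokenizer call signature that does NOT accept padding_side."""
    return {"input_ids": [0], "padding": padding, "truncation": truncation}

# After save/reload, init_kwargs contains every possible init arg:
init_kwargs = {"padding_side": "right", "padding": True}

# Naive merge of init defaults with user call kwargs:
call_kwargs = {**init_kwargs, "truncation": True}

try:
    tokenize("hello", **call_kwargs)
except TypeError as e:
    print(f"merge failed: {e}")

# The temporary fix in the PR simply drops the offending key before calling:
call_kwargs.pop("padding_side", None)
out = tokenize("hello", **call_kwargs)
```

With the key popped, the call succeeds and the remaining merged kwargs are honored.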

@zucchini-nlp (Member):

Not sure I understand correctly. The basic TextKwargs have padding_side, so it seems like it should not be causing errors and should be assigning a kwarg to be used later, when calling the tokenizer. If users don't pass anything, it will be the default kwarg from init time.

@yonigozlan (Member, Author):

I guess the main problem is that padding_side is included in the basic TextKwargs even though some tokenizers' encode functions don't accept it as an argument, such as _batch_encode_plus for PreTrainedTokenizerFast:

def _batch_encode_plus(
    self,
    batch_text_or_text_pairs: Union[
        List[TextInput], List[TextInputPair], List[PreTokenizedInput], List[PreTokenizedInputPair]
    ],
    add_special_tokens: bool = True,
    padding_strategy: PaddingStrategy = PaddingStrategy.DO_NOT_PAD,
    truncation_strategy: TruncationStrategy = TruncationStrategy.DO_NOT_TRUNCATE,
    max_length: Optional[int] = None,
    stride: int = 0,
    is_split_into_words: bool = False,
    pad_to_multiple_of: Optional[int] = None,
    return_tensors: Optional[str] = None,
    return_token_type_ids: Optional[bool] = None,
    return_attention_mask: Optional[bool] = None,
    return_overflowing_tokens: bool = False,
    return_special_tokens_mask: bool = False,
    return_offsets_mapping: bool = False,
    return_length: bool = False,
    verbose: bool = True,
    split_special_tokens: bool = False,
) -> BatchEncoding:

So maybe it shouldn't be in TextKwargs at all? Do we have an example of a tokenizer that needs to set padding_side at call time rather than at init time? What do you think @molbap ?

@molbap (Contributor):

That is correct - no padding_side is used at call time, it seems; it might have been an oversight on my end to include it in the first place. We can check that it is indeed unused, and if so, removing it should be doable without breaking BC.

@yonigozlan (Member, Author):

I didn't find any instances of padding_side being set at call time in Transformers, so I don't think removing it will break anything :) .
I used this regex to search the library: (?=.*\bpadding_side\b)(?=.*\bprocessor\b)\s*(.*\S.*)
It looks for lines where "padding_side" and "processor" both appear (and ignores leading whitespace to avoid duplicate results).
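As a quick sanity check, the regex above can be exercised with Python's re module (the sample lines below are invented for illustration):

```python
import re

# The regex from the comment above: two lookaheads require both words to be
# present anywhere in the line; \s* then skips leading whitespace so the
# captured group starts at the first non-space character.
pattern = re.compile(r"(?=.*\bpadding_side\b)(?=.*\bprocessor\b)\s*(.*\S.*)")

lines = [
    '    out = processor(text, padding_side="left")',  # both words -> match
    'tokenizer.padding_side = "left"',                 # no "processor" -> no match
    "inputs = processor(image, text)",                 # no "padding_side" -> no match
]

matches = [m.group(1) for line in lines if (m := pattern.match(line))]
print(matches)  # only the first line, with leading whitespace stripped
```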

@zucchini-nlp (Member):

Yes, afaik padding_side currently can be set only as tokenizer.padding_side="left". Not sure if it will be used any time in the future as a call-time argument, so I am for removing it.

@@ -57,28 +90,10 @@ def __call__(
self,
images: Optional[ImageInput] = None,
text: Union[TextInput, PreTokenizedInput, List[TextInput], List[PreTokenizedInput]] = None,
text_pair: Optional[Union[PreTokenizedInput, List[PreTokenizedInput]]] = None,
@zucchini-nlp (Member):

IMO kwargs like text_pair do not belong in the TextKwargs. Firstly, because it's not a kwarg used to change the way a text is tokenized. Secondly, it might break BC because most users wouldn't explicitly pass text_pair="My text" (e.g. our example code https://huggingface.co/docs/transformers/en/model_doc/udop#transformers.UdopForConditionalGeneration)

Same might apply to text_target, text_pair_target. Maybe we should leave it as is? Also cc @molbap , would like to hear your opinion :)

@molbap (Contributor) commented Aug 12, 2024:

Agree they don't change tokenization; however, I think they belong there, just because they were present before and are part of the tokenizer signature. From the UDOP docs we say:

Additionally, it also supports passing text_target and text_pair_target to the tokenizer, which can be used to prepare labels for language modeling tasks.

so even if it does not explicitly change the text, it's still a tokenizer option, so in terms of separation of concerns, for me it belongs here (mostly because it was there before).

@yonigozlan (Member, Author):

Yes, I can see how that would be a problem for backward compatibility. Maybe we should deprecate the use of text_pair, text_target, etc. as positional args rather than kwargs? Especially since they are optional and other kwargs can be used without them (e.g. inputs = processor(image, words, boxes=boxes, return_tensors="pt") in the UdopModel docs). However, I'm not sure how we could catch the use of too many positional args to emit a deprecation warning.

@molbap (Contributor):

I'm 💯 for deprecating the usage here and not leaving these args here; as we really want a unified API, I don't want to create exceptions. Even if users might keep using it that way/use the previous version for a while, the end goal is that other libs can also use processors through a single API, just having to inspect types to understand what a processor does.
One way to catch the deprecated usage could be to simply check whether these args are present in the kwargs (and different from their defaults) instead of relying on length. You could also use inspect directly for that.
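One way to implement the check suggested here can be sketched as follows (the names are illustrative, not the actual transformers implementation): capture any extra positional arguments, warn once, and remap them to their legacy keyword.

```python
import warnings

def call_processor(images=None, text=None, *args, audio=None, videos=None, **kwargs):
    """Hypothetical processor __call__ sketch that deprecates legacy positionals."""
    text_kwargs = dict(kwargs)
    # Legacy order of the removed positional parameters after `text`,
    # e.g. `text_pair` for a Udop-style processor:
    legacy_positionals = ["text_pair"]
    if args:
        warnings.warn(
            "Passing `text_pair` positionally is deprecated; use `text_pair=...` instead.",
            FutureWarning,
        )
        # Remap extra positionals onto their legacy keyword names for BC.
        for name, value in zip(legacy_positionals, args):
            text_kwargs.setdefault(name, value)
    return text_kwargs

merged = call_processor(None, "a text", "a text pair")
print(merged["text_pair"])  # prints "a text pair"
```

The advantage over checking `len(args)` alone is that the explicit keyword always wins when both forms are passed, thanks to setdefault.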

@zucchini-nlp (Member):

I don't feel strongly about it - we can keep it in TextKwargs as long as we don't break BC.

@yonigozlan (Member, Author):

I've done this, which is a bit hacky but should preserve BC:

if "text_pair" not in output_kwargs["text_kwargs"]:
    warnings.warn(
        "No `text_pair` kwarg was detected. The use of `text_pair` as an argument without specifying it explicitly as `text_pair=` will be deprecated in future versions."
    )
    # for BC
    if audio is not None:
        output_kwargs["text_kwargs"]["text_pair"] = audio

@zucchini-nlp (Member):

Cool, let's do it with logger.warning_once and move the warning below, so that users see it only if they pass text_pair without indicating text_pair=my_text.

@molbap (Contributor):

Honestly not a big fan of this - we shouldn't be using kwargs for a hack for which they are not advertised. Will take a look this afternoon and try to suggest something

@molbap (Contributor):

Well, I'm kind of out of ideas - I'll trust you to find something clean; at worst we can do as you say but add a deprecation cycle for a few versions later. The closest I could find, which does modify the signature, is simply capturing all extra positional args, like so:

    def __call__(
        self,
        images: Optional[ImageInput] = None,
        text: Union[TextInput, PreTokenizedInput, List[TextInput], List[PreTokenizedInput]] = None,
        *args,
        audio=None,
        videos=None,
        **kwargs: Unpack[UdopProcessorKwargs],
    ) -> BatchFeature:
        """
        This method first forwards the `images` argument to [`~UdopImageProcessor.__call__`]. In case
        [`UdopImageProcessor`] was initialized with `apply_ocr` set to `True`, it passes the obtained words and
        bounding boxes along with the additional arguments to [`~UdopTokenizer.__call__`] and returns the output,
        together with the prepared `pixel_values`. In case [`UdopImageProcessor`] was initialized with `apply_ocr` set
        to `False`, it passes the words (`text`/``text_pair`) and `boxes` specified by the user along with the
        additional arguments to [`~UdopTokenizer.__call__`] and returns the output, together with the prepared
        `pixel_values`.

        Alternatively, one can pass `text_target` and `text_pair_target` to prepare the targets of UDOP.

        Please refer to the docstring of the above two methods for more information.
        """
        # verify input
        output_kwargs = self._merge_kwargs(
            UdopProcessorKwargs,
            tokenizer_init_kwargs=self.tokenizer.init_kwargs,
            **kwargs,
        )
        # for BC, handle unexpected positional arguments
        if len(args) > 0:
            logger.warning_once(
                "Received unexpected positional arguments. These will be mapped accordingly to `text_pair`."
            )
            if len(args) == 1:
                # if there's one extra positional argument, assume it's `text_pair` for backward compatibility
                output_kwargs["text_kwargs"]["text_pair"] = args[0]

which feels a bit less hacky, if it indeed works. It would also allow us not to add extra placeholder args when we have more than one extra arg to take care of, as in this cool work: #32180.
However, I will take my annual holidays soon, so I won't be able to weigh in more on that - I'll trust you to move on with something incredible anyway, as you've done amazing work already @yonigozlan @zucchini-nlp 💜

@zucchini-nlp (Member) commented Aug 12, 2024

Btw, I just realized we are swapping the input args order: it was text-first and now it will be image-first. AFAIK most people are used to passing text and then image in LLaVa models, without indicating the arg name (inputs = processor(prompt, raw_image, return_tensors="pt").to(device, torch.float16)). So we might need to catch those cases.

UPDATE: also, our slow tests for llava (not sure about others) don't follow the new order, so we should update them.
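The kind of check this would require can be sketched as follows (purely illustrative - the helper name and heuristics are assumptions, not the eventual transformers code): if the `images` argument looks like text and `text` doesn't, assume the legacy text-first order and swap.

```python
def _is_text_like(x):
    """Heuristic: a string, or a list/tuple of strings, looks like text input."""
    return isinstance(x, str) or (
        isinstance(x, (list, tuple)) and len(x) > 0 and all(isinstance(t, str) for t in x)
    )

def normalize_input_order(images=None, text=None):
    # Legacy callers passed (text, images); detect that and swap.
    # A real implementation would also emit a deprecation warning here.
    if _is_text_like(images) and not _is_text_like(text):
        images, text = text, images
    return images, text

prompt = "USER: <image> What is this? ASSISTANT:"
raw_image = object()  # stand-in for a PIL image

images, text = normalize_input_order(prompt, raw_image)
print(text == prompt)  # True: the legacy order was detected and swapped
```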

@yonigozlan force-pushed the uniformize-image-text-to-text-processors-kwargs branch from 02a4ec3 to 76bb138 on August 13, 2024 16:33
@yonigozlan (Member, Author):

So this PR is getting bigger than I anticipated :). I think it's close to ready to be merged, so I re-requested reviews - but maybe I should break it up into smaller PRs first? cc @zucchini-nlp @molbap

@zucchini-nlp (Member) left a comment

Great work, LGTM!

Review thread (resolved): docs/source/en/model_doc/fuyu.md
PreTrainedTokenizerBase,
PreTrainedTokenizerFast,
UdopProcessor,
@zucchini-nlp (Member):

UdopProcessor is imported below if pytesseract is available, so IMO we don't need to add it here.

@yonigozlan (Member, Author):

I removed it from the pytesseract check instead, as there is a strange bug where the line below the class definition (processor_class = UdopProcessor) is still executed even if the requirements are not satisfied, which breaks the CI.

@zucchini-nlp (Member) commented Aug 19, 2024:

Does that mean UDOP has no dependency on pytesseract to run the processor test and will run successfully?

@yonigozlan (Member, Author):

Only the image_processor (LayoutLMv3ImageProcessor) depends on pytesseract, but since the import check is already done at the level of LayoutLMv3ImageProcessor, it doesn't seem to me that it should also be done when importing the processor? Though I'm not sure how these nested requirement checks should be dealt with.

@zucchini-nlp (Member):

I see - that's weird, because the Tester class has a require_pytesseract dependency, which afaik is the same as is_pytesseract_available().

Actually, merely importing the processor has no dependencies; from what I see, it doesn't use pytesseract directly. So it should be ok to import it as is, and the tests should be skipped by require_pytesseract if the package is not installed. I'm just curious why that broke, if you have bandwidth to explore it - I couldn't reproduce it by removing UdopProcessor from the general imports.
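For context, a require_pytesseract-style decorator is roughly a skipUnless wrapper; the sketch below is a simplified version of that pattern, not the actual transformers.testing_utils code. Note that a class-level attribute like processor_class = UdopProcessor is evaluated at class-definition time, before any skip decorator runs - which is why the import itself must succeed even when the optional dependency is missing.

```python
import importlib.util
import unittest

def is_pytesseract_available() -> bool:
    # find_spec returns None when the package cannot be found.
    return importlib.util.find_spec("pytesseract") is not None

def require_pytesseract(test_case):
    # Skip (rather than fail) the decorated test or TestCase class
    # when the optional dependency is missing.
    return unittest.skipUnless(is_pytesseract_available(), "test requires pytesseract")(test_case)
```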

@molbap (Contributor) commented Aug 16, 2024

As noted, I'm not a fan of using audio to alleviate text_pair and such - this hack is used in #32841 and #32845, and I think it'd set the wrong precedent in the codebase; all the rest is perfectly fine.

def model_input_names(self):
-    return ["input_ids", "bbox", "attention_mask", "pixel_values"]
+    return ["pixel_values", "input_ids", "attention_mask", "bbox"]
@molbap (Contributor):

just noted: why change the order here?

@yonigozlan (Member, Author):

I changed the return type of UdopProcessor from a BatchEncoding to a BatchFeature by updating the encoded images with the encoded text (rather than the other way around), which changed the order of the output keys.

@yonigozlan force-pushed the uniformize-image-text-to-text-processors-kwargs branch 3 times, most recently from 4e23ee7 to f5d8507 on September 16, 2024 19:48
@yonigozlan
Copy link
Member Author

Removed Udop from this PR, as it has some specific args to handle, so I'm waiting for #33479 to be merged before opening another PR for it.
Meanwhile, this PR is ready for review!

@yonigozlan force-pushed the uniformize-image-text-to-text-processors-kwargs branch from cb961ad to e6ceb28 on September 20, 2024 16:08
@amyeroberts (Collaborator) left a comment

Really nice work ❤️

Great to see the combined efforts to make a clean processor interface being propagated to clean up the codebase 🧹

Just a few small comments - main ones about the commented out code


# @require_vision
# @require_torch
# def test_tokenizer_defaults_preserved_by_kwargs(self):
@amyeroberts (Collaborator):

To uncomment?

@yonigozlan (Member, Author):

Yes, they can even be removed - I forgot to do it, thanks!


# @require_vision
# @require_torch
# def test_kwargs_overrides_default_tokenizer_kwargs(self):
@amyeroberts (Collaborator):

Same here?

@@ -179,261 +179,3 @@ def test_model_input_names(self):
list(inputs.keys()),
["input_ids", "attention_mask", "qformer_input_ids", "qformer_attention_mask", "pixel_values"],
)

@amyeroberts (Collaborator):

So much code deletion 🤩

Comment on lines 172 to 173
# Temporary fix for "padding_side" in init_kwargs
_ = output_kwargs["text_kwargs"].pop("padding_side", None)
@amyeroberts (Collaborator):

Is this still needed? I can't remember the state of the solution for this

@yonigozlan (Member, Author):

No it shouldn't be needed anymore! Thanks for catching that :)

-            return_tensors=return_tensors if images is None else None,
-            **kwargs,
+        output_kwargs["text_kwargs"]["add_special_tokens"] = (
+            output_kwargs["text_kwargs"]["add_special_tokens"] and add_eos_token
@amyeroberts (Collaborator):

I know this is matching the logic above but this seems like it would produce some very surprising behaviour 👀 (not a comment saying you should change things, just noting)
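To illustrate why this is surprising, with made-up values (not the real processor state): the logical AND silently overrides an explicit user choice.

```python
# Suppose the processor was configured not to add an EOS token:
add_eos_token = False

# The user explicitly asks for special tokens at call time:
user_text_kwargs = {"add_special_tokens": True}

# The AND combination forces the user's True back to False:
user_text_kwargs["add_special_tokens"] = (
    user_text_kwargs["add_special_tokens"] and add_eos_token
)

print(user_text_kwargs["add_special_tokens"])  # False, despite the explicit True
```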

images=[lowres_img, cats_image], text=[self.prompt, self.prompt], return_tensors="pt", padding=True
).to(torch_device)

model.train()
@amyeroberts (Collaborator):

Why set to training mode here? Is there an assertion on right padding because of this?

@yonigozlan (Member, Author):

Not sure what happened here, as I don't think I made these changes 😅 - maybe the rebase went wrong at some point. I will remove all that.

images=[lowres_img, cats_image], text=[self.prompt, self.prompt], return_tensors="pt", padding=True
).to(torch_device)

model.train()
@amyeroberts (Collaborator):

same q here about forcing into training mode

@yonigozlan (Member, Author):

same as above

@amyeroberts (Collaborator) left a comment

Looks great! :D

@yonigozlan yonigozlan merged commit 5f0c181 into huggingface:main Sep 25, 2024
18 checks passed
avishaiElmakies pushed a commit to avishaiElmakies/transformers that referenced this pull request Sep 25, 2024
* uniformize FUYU processor kwargs

* Uniformize instructblip processor kwargs

* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2

* Uniformize llava_next processor

* Fix save_load test for processor with chat_template only as extra init args

* Fix import Unpack

* Fix Fuyu Processor import

* Fix FuyuProcessor import

* Fix FuyuProcessor

* Add defaults for specific kwargs kosmos2

* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs

* Add tests processor Udop

* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature

* Fix overwrite tests kwargs processors

* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop

* Fix processing test fuyu

* remove unnecessary pad_token check in instructblip ProcessorTest

* Fix BC tests and cleanup

* FIx imports fuyu

* Uniformize Pix2Struct

* Fix wrong name for FuyuProcessorKwargs

* Fix slow tests reversed inputs align fuyu llava-next, change udop warning

* Fix wrong logging import udop

* Add check images text input order

* Fix copies

* change text pair handling when positional arg

* rebase on main, fix imports in test_processing_common

* remove optional args and udop uniformization from this PR

* fix failing tests

* remove unnecessary test, fix processing utils and test processing common

* cleanup Unpack

* cleanup

* fix conflict grounding dino
amyeroberts pushed a commit to amyeroberts/transformers that referenced this pull request Oct 2, 2024
ArthurZucker added a commit that referenced this pull request Oct 10, 2024
* add sdpa to OPT

* chore: remove redundant whitespace in OPTDecoder class

* fixup

* bug fix

* add sdpa and attention generate test

* fixup

* Refactor OPTAttention forward method for improved readability and maintainability

* undo refactor for _shape and key,val states

* add OPT to doc, fixup didn't find it for some reason

* change order

* change default attn_implemntation in testing to eager

* [run-slow] opt

* change test_eager_matches_sdpa_generate to the one llama

* Update default attention implementation in testing common

* [run-slow] opt

* remove uneeded print

* [run-slow] opt

* refactor model testers to have attn_implementation="eager"

* [run-slow] opt

* convert test_eager_matches_sdpa_generate to opt-350M

* bug fix when creating mask for opt

* [run-slow] opt

* if layer head mask default to eager

* if head mask is not none fall to eager

* [run-slow] opt

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: amyeroberts <[email protected]>

* Clean up Unpack imports (#33631)

clean up Unpack imports

* Fix DPT /Dinov2 sdpa regression on main (#33660)

* fallback to eager if output attentions.

* fix copies

* handle dependency errors in check_imports (#33622)

* handle dependency errors in check_imports

* change log level to warning

* add back self.max_position_embeddings = config.max_position_embeddings (#33550)

* add back self.max_position_embeddings = config.max_position_embeddings

* fix-copies

* Fix Llava conversion for LlavaQwen2ForCausalLM with Clip vision tower (#33613)

fix llavaqwen2 model conversion

* Uniformize kwargs for Udop processor and update docs (#33628)

* Add optional kwargs and uniformize udop

* cleanup Unpack

* nit Udop

* Generation: deprecate `PreTrainedModel` inheriting from `GenerationMixin`  (#33203)

* Enable BNB multi-backend support (#31098)

* enable cpu bnb path

* fix style

* fix code style

* fix 4 bit path

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <[email protected]>

* add multi backend refactor tests

* fix style

* tweak 4bit quantizer + fix corresponding tests

* tweak 8bit quantizer + *try* fixing corresponding tests

* fix dequant bnb 8bit

* account for Intel CPU in variability of expected outputs

* enable cpu and xpu device map

* further tweaks to account for Intel CPU

* fix autocast to work with both cpu + cuda

* fix comments

* fix comments

* switch to testing_utils.torch_device

* allow for xpu in multi-gpu tests

* fix tests 4bit for CPU NF4

* fix bug with is_torch_xpu_available needing to be called as func

* avoid issue where test reports attr err due to other failure

* fix formatting

* fix typo from resolving of merge conflict

* polish based on last PR review

Co-authored-by: Marc Sun <[email protected]>

* fix CI

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <[email protected]>

* fix error log

* fix error msg

* add \n in error log

* make quality

* rm bnb cuda restriction in doc

* cpu model don't need dispatch

* fix doc

* fix style

* check cuda avaliable in testing

* fix tests

* Update docs/source/en/model_doc/chameleon.md

Co-authored-by: Marc Sun <[email protected]>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Aarni Koskela <[email protected]>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <[email protected]>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <[email protected]>

* fix doc

* fix check multibackends

* fix import sort

* remove check torch in bnb

* docs: update bitsandbytes references with multi-backend info

* docs: fix small mistakes in bnb paragraph

* run formatting

* reveret bnb check

* move bnb multi-backend check to import_utils

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <[email protected]>

* fix bnb check

* minor fix for bnb

* check lib first

* fix code style

* Revert "run formatting"

This reverts commit ac108c6.

* fix format

* give warning when bnb version is low and no cuda found]

* fix device assignment check to be multi-device capable

* address akx feedback on get_avlbl_dev fn

* revert partially, as we don't want the function that public, as docs would be too much (enforced)

---------

Co-authored-by: Aarni Koskela <[email protected]>
Co-authored-by: Titus von Koeller <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Arthur <[email protected]>

* Fix error string after refactoring into get_chat_template (#33652)

* Fix error string after refactoring into get_chat_template

* Take suggestion from CR

Co-authored-by: Matt <[email protected]>

---------

Co-authored-by: Matt <[email protected]>

* uniformize git processor (#33668)

* uniformize git processor

* update doctring

* Modular `transformers`: modularity and inheritance for new model additions (#33248)

* update exampel

* update

* push the converted diff files for testing and ci

* correct one example

* fix class attributes and docstring

* nits

* oups

* fixed config!

* update

* nitd

* class attributes are not matched against the other, this is missing

* fixed overwriting self.xxx now onto the attributes I think

* partial fix, now order with docstring

* fix docstring order?

* more fixes

* update

* fix missing docstrings!

* examples don't all work yet

* fixup

* nit

* updated

* hick

* update

* delete

* update

* update

* update

* fix

* all default

* no local import

* fix more diff

* some fix related to "safe imports"

* push fixed

* add helper!

* style

* add a check

* all by default

* add the

* update

* FINALLY!

* nit

* fix config dependencies

* man that is it

* fix fix

* update diffs

* fix the last issue

* re-default to all

* alll the fixes

* nice

* fix properties vs setter

* fixup

* updates

* update dependencies

* make sure to install what needs to be installed

* fixup

* quick fix for now

* fix!

* fixup

* update

* update

* updates

* whitespaces

* nit

* fix

* simplify everything, and make it file agnostic (should work for image processors)

* style

* finish fixing all import issues

* fixup

* empty modeling should not be written!

* Add logic to find who depends on what

* update

* cleanup

* update

* update gemma to support positions

* some small nits

* this is the correct docstring for gemma2

* fix merging of docstrings

* update

* fixup

* update

* take doc into account

* styling

* update

* fix hidden activation

* more fixes

* final fixes!

* fixup

* fixup instruct  blip video

* update

* fix bugs

* align gemma2 with the rest as well

* updats

* revert

* update

* more reversiom

* grind

* more

* arf

* update

* order will matter

* finish del stuff

* update

* rename to modular

* fixup

* nits

* update makefile

* fixup

* update order of the checks!

* fix

* fix docstring that has a call inside

* fiix conversion check

* style

* add some initial documentation

* update

* update doc

* some fixup

* updates

* yups

* Mostly todo gimme a minut

* update

* fixup

* revert some stuff

* Review docs for the modular transformers (#33472)

Docs

* good update

* fixup

* mmm current updates lead to this code

* okay, this fixes it

* cool

* fixes

* update

* nit

* updates

* nits

* fix doc

* update

* revert bad changes

* update

* updates

* proper update

* update

* update?

* up

* update

* cool

* nits

* nits

* bon bon

* fix

* ?

* minimise changes

* update

* update

* update

* updates?

* fixed gemma2

* kind of a hack

* nits

* update

* remove `diffs` in favor of `modular`

* fix make fix copies

---------

Co-authored-by: Lysandre Debut <[email protected]>
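
The modular-transformers work referenced in the commit above lets a new model be written as a small file that inherits from an existing model, with a converter expanding it into a flat, standalone modeling file. As a rough, dependency-free sketch of the inheritance idea (toy class names and toy math, not the actual transformers API):

```python
# Toy illustration of the "modular" idea: a new model reuses an existing
# model's code by subclassing and overrides only what differs. The real
# tooling then flattens such a file into a standalone modeling_*.py;
# everything below is made up for illustration.

class LlamaAttention:
    def forward(self, hidden_states):
        # stand-in for real attention: just double each value
        return [h * 2 for h in hidden_states]

class LlamaModel:
    def __init__(self):
        self.attn = LlamaAttention()

    def forward(self, hidden_states):
        return self.attn.forward(hidden_states)

# Hypothetical "modular_newmodel.py": only the deltas are written by hand.
class NewModelAttention(LlamaAttention):
    def forward(self, hidden_states):
        # the one behavioral change this hypothetical model makes
        return [h * 2 + 1 for h in hidden_states]

class NewModelModel(LlamaModel):
    def __init__(self):
        self.attn = NewModelAttention()

print(NewModelModel().forward([1, 2, 3]))  # [3, 5, 7]
```

The point of the converter step is that the generated `modeling_newmodel.py` contains the fully expanded classes, so the shipped model file keeps the library's one-model-one-file property with no cross-model imports.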

* Fix CIs post merging modular transformers (#33681)

update

* Fixed docstring for cohere model regarding unavailability of prune_he… (#33253)

* Fixed docstring for cohere model regarding unavailability of prune_head() methods

The docstring mentions that the Cohere model supports the prune_heads() method. I have fixed the docstring by explicitly stating that it doesn't support that functionality.

* Update src/transformers/models/cohere/modeling_cohere.py

---------

Co-authored-by: Lysandre Debut <[email protected]>

* Generation tests: update imagegpt input name, remove unused functions (#33663)

* Improve Error Messaging for Flash Attention 2 on CPU (#33655)

Update flash-attn error message on CPU

Rebased to latest branch

* Gemma2: fix config initialization (`cache_implementation`) (#33684)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used (#33556)

* Fix ByteLevel alphabet missing when Sequence pretokenizer is used

* Fixed formatting with `ruff`.

* Uniformize kwargs for image-text-to-text processors (#32544)

* uniformize FUYU processor kwargs

* Uniformize instructblip processor kwargs

* Fix processor kwargs and tests Fuyu, InstructBlip, Kosmos2

* Uniformize llava_next processor

* Fix save_load test for processor with chat_template only as extra init args

* Fix import Unpack

* Fix Fuyu Processor import

* Fix FuyuProcessor import

* Fix FuyuProcessor

* Add defaults for specific kwargs kosmos2

* Fix Udop to return BatchFeature instead of BatchEncoding and uniformize kwargs

* Add tests processor Udop

* remove Copied from in processing Udop as change of input orders caused by BatchEncoding -> BatchFeature

* Fix overwrite tests kwargs processors

* Add warnings and BC for changes in processor inputs order, change docs, add BC for text_pair as arg for Udop

* Fix processing test fuyu

* remove unnecessary pad_token check in instructblip ProcessorTest

* Fix BC tests and cleanup

* FIx imports fuyu

* Uniformize Pix2Struct

* Fix wrong name for FuyuProcessorKwargs

* Fix slow tests reversed inputs align fuyu llava-next, change udop warning

* Fix wrong logging import udop

* Add check images text input order

* Fix copies

* change text pair handling when positional arg

* rebase on main, fix imports in test_processing_common

* remove optional args and udop uniformization from this PR

* fix failing tests

* remove unnecessary test, fix processing utils and test processing common

* cleanup Unpack

* cleanup

* fix conflict grounding dino
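
The uniformization work listed above converges these processors on a single calling convention: `__call__(images, text, ..., **kwargs)`, with kwargs grouped by modality and merged against per-processor defaults. A minimal self-contained sketch of that defaults-merging behavior (simplified names; the actual implementation lives in `transformers.processing_utils`, and this is not its real code):

```python
# Simplified sketch of how uniformized processors merge call-time kwargs
# with per-processor defaults, grouped by modality (text vs images).
# Names and structure are illustrative, not the transformers internals.

DEFAULTS = {
    "text_kwargs": {"padding": False, "return_tensors": None},
    "images_kwargs": {"do_resize": True},
}

def merge_kwargs(defaults, **kwargs):
    merged = {group: dict(opts) for group, opts in defaults.items()}
    for key, value in kwargs.items():
        placed = False
        for group in merged:
            if key in merged[group]:       # known, modality-specific kwarg
                merged[group][key] = value
                placed = True
        if not placed:                      # unknown kwargs are rejected early
            raise TypeError(f"unexpected keyword argument: {key!r}")
    return merged

out = merge_kwargs(DEFAULTS, padding="max_length")
print(out["text_kwargs"])    # {'padding': 'max_length', 'return_tensors': None}
print(out["images_kwargs"])  # {'do_resize': True}
```

This is what lets every processor in the list above accept the same kwargs surface while still applying model-specific defaults (e.g. the Kosmos-2 defaults mentioned in one of the commits).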

* 🚨🚨 Setting default behavior of assisted decoding (#33657)

* tests: fix pytorch tensor placement errors (#33485)

This commit fixes the following errors:
* Fix "expected all tensors to be on the same device" error
* Fix "can't convert device type tensor to numpy"

According to the PyTorch documentation, torch.Tensor.numpy(force=False)
performs the conversion only if the tensor is on the CPU (plus a few other
restrictions), which is not the case here. We need force=True since we only
need the data and don't care about tensor coherency.

Fixes: #33517
See: https://pytorch.org/docs/2.4/generated/torch.Tensor.numpy.html

Signed-off-by: Dmitry Rogozhkin <[email protected]>

* bump tokenizers, fix added tokens fast (#32535)

* update based on tokenizers release

* update

* nits

* update

* revert re addition

* don't break that yet

* fmt

* revert unwanted

* update tokenizers version

* update dep table

* update

* update in conversion script as well

* some fix

* revert

* fully revert

* fix training

* remove set trace

* fixup

* update

* update

* [Pixtral] Improve docs, rename model (#33491)

* Improve docs, rename model

* Fix style

* Update repo id

* fix code quality after merge

* HFQuantizer implementation for compressed-tensors library (#31704)

* Add compressed-tensors HFQuantizer implementation

* flag serializable as False

* run

* revive lines deleted by ruff

* fixes to load+save from sparseml, edit config to quantization_config, and load back

* address satrat comment

* compressed_tensors to compressed-tensors and revert back is_serializable

* rename quant_method from sparseml to compressed-tensors

* tests

* edit tests

* clean up tests

* make style

* cleanup

* cleanup

* add test skip for when compressed tensors is not installed

* remove pydantic import + style

* delay torch import in test

* initial docs

* update main init for compressed tensors config

* make fix-copies

* docstring

* remove fill_docstring

* Apply suggestions from code review

Co-authored-by: Marc Sun <[email protected]>

* review comments

* review comments

* comments - suppress warnings on state dict load, tests, fixes

* bug-fix - remove unnecessary call to apply quant lifecycle

* run_compressed compatability

* revert changes not needed for compression

* no longer need unexpected keys fn

* unexpected keys not needed either

* Apply suggestions from code review

Co-authored-by: Marc Sun <[email protected]>

* add to_diff_dict

* update docs and expand testing

* Update _toctree.yml with compressed-tensors

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <[email protected]>

* update doc

* add note about saving a loaded model

---------

Co-authored-by: George Ohashi <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Sara Adkins <[email protected]>
Co-authored-by: Sara Adkins <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
Co-authored-by: Dipika <[email protected]>
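
For context on the compressed-tensors commit above: a checkpoint saved with that library carries its quantization scheme in the model's config.json under `quantization_config`, and transformers dispatches on `quant_method` to pick the HFQuantizer. A rough illustration of the shape of such a block (field values are hypothetical, not taken from a real checkpoint):

```json
{
  "quantization_config": {
    "quant_method": "compressed-tensors",
    "format": "int-quantized",
    "config_groups": {
      "group_0": {
        "targets": ["Linear"],
        "weights": {
          "num_bits": 8,
          "type": "int",
          "symmetric": true,
          "strategy": "tensor"
        }
      }
    }
  }
}
```

As noted in the commit list, such a config is serialized back out via `to_diff_dict`, so a loaded model round-trips its quantization metadata on save.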

* update model card for opt

* add batch size to inference table

* [slow-run] opt

* [run-slow] opt

---------

Signed-off-by: Dmitry Rogozhkin <[email protected]>
Co-authored-by: Avishai Elmakies <[email protected]>
Co-authored-by: amyeroberts <[email protected]>
Co-authored-by: Pablo Montalvo <[email protected]>
Co-authored-by: chengchengpei <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Yoni Gozlan <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
Co-authored-by: jiqing-feng <[email protected]>
Co-authored-by: Aarni Koskela <[email protected]>
Co-authored-by: Titus von Koeller <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: Tibor Reiss <[email protected]>
Co-authored-by: Matt <[email protected]>
Co-authored-by: Lysandre Debut <[email protected]>
Co-authored-by: Muhammad Naufil <[email protected]>
Co-authored-by: sizhky <[email protected]>
Co-authored-by: Umar Butler <[email protected]>
Co-authored-by: Jonathan Mamou <[email protected]>
Co-authored-by: Dmitry Rogozhkin <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: Arthur Zucker <[email protected]>
Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: George Ohashi <[email protected]>
Co-authored-by: Sara Adkins <[email protected]>
Co-authored-by: Sara Adkins <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
Co-authored-by: Dipika <[email protected]>
NielsRogge added a commit to NielsRogge/transformers that referenced this pull request Oct 21, 2024
* add sdpa to OPT

* chore: remove redundant whitespace in OPTDecoder class

* fixup

* bug fix

* add sdpa and attention generate test

* fixup

* Refactor OPTAttention forward method for improved readability and maintainability

* undo refactor for _shape and key,val states

* add OPT to doc, fixup didn't find it for some reason

* change order

* change default attn_implemntation in testing to eager

* [run-slow] opt

* change test_eager_matches_sdpa_generate to the one llama

* Update default attention implementation in testing common

* [run-slow] opt

* remove uneeded print

* [run-slow] opt

* refactor model testers to have attn_implementation="eager"

* [run-slow] opt

* convert test_eager_matches_sdpa_generate to opt-350M

* bug fix when creating mask for opt

* [run-slow] opt

* if layer head mask default to eager

* if head mask is not none fall to eager

* [run-slow] opt

* Update src/transformers/models/opt/modeling_opt.py

Co-authored-by: amyeroberts <[email protected]>

* Clean up Unpack imports (huggingface#33631)

clean up Unpack imports

* Fix DPT /Dinov2 sdpa regression on main (huggingface#33660)

* fallback to eager if output attentions.

* fix copies
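
The fallback in this commit reflects a general constraint: fused SDPA kernels return only the attention output, never the attention-weight matrix, so when a caller requests output_attentions the model must drop back to the "eager" path that materializes the weights. A toy, dependency-free sketch of that dispatch (plain Python lists, not the actual DPT/DINOv2 code):

```python
import math

# Toy attention over plain Python lists, illustrating why models fall back
# to the "eager" path when attention weights are requested: the fused path
# (like torch SDPA) keeps no weight matrix around.

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def eager_attention(scores, values):
    weights = [softmax(row) for row in scores]
    outputs = [sum(w * v for w, v in zip(row, values)) for row in weights]
    return outputs, weights          # weights are available

def fused_attention(scores, values):
    outputs, _ = eager_attention(scores, values)
    return outputs                   # stand-in for a kernel that discards weights

def attention(scores, values, output_attentions=False):
    if output_attentions:            # the fallback this commit restores
        return eager_attention(scores, values)
    return fused_attention(scores, values), None

outs, weights = attention([[0.0, 1.0]], [1.0, 2.0], output_attentions=True)
assert weights is not None
```

The same reasoning shows up in several models' SDPA integrations: the fast path is used by default, and the eager path only when attention weights (or, similarly, a layer head mask) are needed.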

* handle dependency errors in check_imports (huggingface#33622)

* handle dependency errors in check_imports

* change log level to warning

* add back self.max_position_embeddings = config.max_position_embeddings (huggingface#33550)

* add back self.max_position_embeddings = config.max_position_embeddings

* fix-copies

* Fix Llava conversion for LlavaQwen2ForCausalLM with Clip vision tower (huggingface#33613)

fix llavaqwen2 model conversion

* Uniformize kwargs for Udop processor and update docs (huggingface#33628)

* Add optional kwargs and uniformize udop

* cleanup Unpack

* nit Udop

* Generation: deprecate `PreTrainedModel` inheriting from `GenerationMixin`  (huggingface#33203)

* Enable BNB multi-backend support (huggingface#31098)

* enable cpu bnb path

* fix style

* fix code style

* fix 4 bit path

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <[email protected]>

* add multi backend refactor tests

* fix style

* tweak 4bit quantizer + fix corresponding tests

* tweak 8bit quantizer + *try* fixing corresponding tests

* fix dequant bnb 8bit

* account for Intel CPU in variability of expected outputs

* enable cpu and xpu device map

* further tweaks to account for Intel CPU

* fix autocast to work with both cpu + cuda

* fix comments

* fix comments

* switch to testing_utils.torch_device

* allow for xpu in multi-gpu tests

* fix tests 4bit for CPU NF4

* fix bug with is_torch_xpu_available needing to be called as func

* avoid issue where test reports attr err due to other failure

* fix formatting

* fix typo from resolving of merge conflict

* polish based on last PR review

Co-authored-by: Marc Sun <[email protected]>

* fix CI

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: Arthur <[email protected]>

* fix error log

* fix error msg

* add \n in error log

* make quality

* rm bnb cuda restriction in doc

* cpu model don't need dispatch

* fix doc

* fix style

* check cuda avaliable in testing

* fix tests

* Update docs/source/en/model_doc/chameleon.md

Co-authored-by: Marc Sun <[email protected]>

* Update docs/source/en/model_doc/llava_next.md

Co-authored-by: Aarni Koskela <[email protected]>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <[email protected]>

* Update tests/quantization/bnb/test_4bit.py

Co-authored-by: Aarni Koskela <[email protected]>

* fix doc

* fix check multibackends

* fix import sort

* remove check torch in bnb

* docs: update bitsandbytes references with multi-backend info

* docs: fix small mistakes in bnb paragraph

* run formatting

* reveret bnb check

* move bnb multi-backend check to import_utils

* Update src/transformers/utils/import_utils.py

Co-authored-by: Aarni Koskela <[email protected]>

* fix bnb check

* minor fix for bnb

* check lib first

* fix code style

* Revert "run formatting"

This reverts commit ac108c6.

* fix format

* give warning when bnb version is low and no cuda found]
